Hide-and-Shill: A Reinforcement Learning Framework for Market Manipulation Detection in Symphony-a Decentralized Multi-Agent System

Shi, Ronghua, Liu, Yiou, Ying, Xinyu, Tan, Yang, Feng, Yuchun, Ai, Lynn, Shi, Bill, Wang, Xuhui, Liu, Zhuang

arXiv.org Artificial Intelligence

Decentralized finance (DeFi) has ushered in a new era of permissionless financial innovation, but it has also opened the door to discourse-driven market manipulation at unprecedented scale. Without centralized gatekeepers or regulatory oversight, malicious actors now coordinate shilling campaigns and pump-and-dump schemes across social platforms and on-chain ecosystems. We propose Hide-and-Shill, a novel Multi-Agent Reinforcement Learning (MARL) framework for decentralized manipulation detection. By modeling the interaction between manipulators and detectors as a dynamic adversarial game, the framework learns to identify suspicious discourse patterns using delayed token price reactions as ground-truth financial signals. Our method introduces three key innovations: (1) Group Relative Policy Optimization (GRPO) to improve learning stability in sparse-reward and partially observable settings; (2) a theory-grounded reward function inspired by rational expectations and information asymmetry, distinguishing price discovery from manipulation-induced noise; and (3) a multi-modal agent pipeline that fuses LLM-based semantic features, social graph signals, and on-chain market data for informed decision-making. To support scalable and trustless deployment, our framework is integrated within the Symphony system, a decentralized multi-agent coordination architecture that enables peer-to-peer agent execution, trust-aware learning through distributed logs, and chain-verifiable evaluation. Symphony facilitates adversarial co-evolution among strategic actors and maintains robust manipulation detection without reliance on centralized oracles, empowering real-time surveillance across global DeFi discourse ecosystems. Trained on 100,000 real-world discourse episodes and validated in adversarial co-evolution simulations, Hide-and-Shill achieves state-of-the-art performance in both detection accuracy and causal attribution.
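As a rough illustration of the group-relative idea behind GRPO mentioned above, the sketch below normalizes each sampled response's reward against the mean and standard deviation of its own group, which removes the need for a separate learned value baseline. This is a minimal sketch of the commonly described GRPO advantage computation, not the authors' implementation: the function name `group_relative_advantages`, the use of the population standard deviation, the `eps` term, and the example reward values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of sampled responses.

    Each advantage is (reward - group mean) / (group std + eps).
    Population std and the eps value are illustrative assumptions.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four detector outputs on the same discourse
# episode, e.g. 1.0 when a flag matched the delayed price reaction.
example_rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(example_rewards)
```

In this toy group, responses with above-average reward receive positive advantages and the rest receive negative ones. When every response in a group earns the same reward, all advantages are zero and the group contributes no learning signal, which is one reason sparse-reward settings benefit from sampling several responses per episode.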